Overview

Dataset info

Number of variables: 41
Number of observations: 59400
Missing cells: 46094 (1.9%)
Duplicate rows: 0 (0.0%)
Total size in memory: 21.5 MiB
Average record size in memory: 380.1 B
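The missing-cell percentage can be reproduced from the other figures in this block. The short sketch below uses only the numbers printed above; the variable names are illustrative.

```python
# Figures copied from the "Dataset info" block.
n_variables = 41
n_observations = 59_400
missing_cells = 46_094

# Every observation contributes one cell per variable.
total_cells = n_variables * n_observations
missing_share = missing_cells / total_cells

print(f"{total_cells} cells, {missing_share:.1%} missing")
# prints "2435400 cells, 1.9% missing"
```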

Variable types

Numeric: 10
Categorical: 27
Boolean: 2
Date: 0
URL: 0
Text (Unique): 0
Rejected: 2
Unsupported: 0

Warnings

amount_tsh is highly skewed (γ1 = 57.80779995) [Skewed]
amount_tsh has 41639 (70.1%) zeros [Zeros]
construction_year has 20709 (34.9%) zeros [Zeros]
date_recorded only contains datetime values, but is categorical. Consider applying pd.to_datetime() [Type]
date_recorded has a high cardinality: 356 distinct values [Warning]
funder has a high cardinality: 1898 distinct values [Warning]
funder has 3635 (6.1%) missing values [Missing]
gps_height has 20438 (34.4%) zeros [Zeros]
installer has a high cardinality: 2146 distinct values [Warning]
installer has 3655 (6.2%) missing values [Missing]
lga has a high cardinality: 125 distinct values [Warning]
longitude has 1812 (3.1%) zeros [Zeros]
num_private is highly skewed (γ1 = 91.93374999) [Skewed]
num_private has 58643 (98.7%) zeros [Zeros]
permit has 3056 (5.1%) missing values [Missing]
population has 21381 (36.0%) zeros [Zeros]
public_meeting has 3334 (5.6%) missing values [Missing]
quantity_group is a recoding of quantity [Rejected]
recorded_by has constant value "GeoData Consultants Ltd" [Rejected]
scheme_management has 3877 (6.5%) missing values [Missing]
scheme_name has a high cardinality: 2697 distinct values [Warning]
scheme_name has 28166 (47.4%) missing values [Missing]
subvillage has a high cardinality: 19288 distinct values [Warning]
ward has a high cardinality: 2092 distinct values [Warning]
wpt_name has a high cardinality: 37400 distinct values [Warning]
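The type warning for date_recorded suggests pd.to_datetime(). A minimal sketch of that conversion follows. It assumes the column holds ISO-formatted strings, as the frequency table for date_recorded indicates. The three-row DataFrame is a toy stand-in for the real data, and year_recorded is a hypothetical derived column.

```python
import pandas as pd

# Toy stand-in for the real column; values taken from the report's frequency table.
df = pd.DataFrame({"date_recorded": ["2011-03-15", "2011-03-17", "2013-02-03"]})

# Parse the strings; an explicit format makes malformed entries raise an error
# instead of being guessed at.
df["date_recorded"] = pd.to_datetime(df["date_recorded"], format="%Y-%m-%d")

# Once parsed, calendar parts are available through the .dt accessor.
df["year_recorded"] = df["date_recorded"].dt.year
```

After the conversion the column is a datetime type, so a profiling tool would no longer count each date as a separate category.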

Variables

amount_tsh
Numeric

Distinct count: 98
Unique (%): 0.2%
Missing (%): 0.0%
Missing (n): 0
Infinite (%): 0.0%
Infinite (n): 0
Mean: 317.6503847
Minimum: 0
Maximum: 350000
Zeros (%): 70.1%

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 20
95-th percentile: 1200
Maximum: 350000
Range: 350000
Interquartile range: 20

Descriptive statistics

Standard deviation: 2997.574558
Coef of variation: 9.436709989
Kurtosis: 4903.543102
Mean: 317.6503847
MAD: 522.1244629
Skewness: 57.80779995
Sum: 18868432.85
Variance: 8985453.232
Memory size: 928.1 KiB

Histogram with fixed size bins (bins=50)
Histogram with variable size bins (bins=[0.000e+00 1.000e-01 3.500e+00 6.500e+00 8.000e+00 ... 2.250e+04 5.500e+04 1.085e+05 1.185e+05 3.500e+05], "bayesian blocks" binning strategy used)

Value Count Frequency (%)
0 41639 70.1%
 
500 3102 5.2%
 
50 2472 4.2%
 
1000 1488 2.5%
 
20 1463 2.5%
 
200 1220 2.1%
 
100 816 1.4%
 
10 806 1.4%
 
30 743 1.3%
 
2000 704 1.2%
 
Other values (88) 4947 8.3%
 

Minimum 5 values

Value Count Frequency (%)
0 41639 70.1%
 
0.2 3 < 0.1%
 
0.25 1 < 0.1%
 
1 3 < 0.1%
 
2 13 < 0.1%
 

Maximum 5 values

Value Count Frequency (%)
350000 1 < 0.1%
 
250000 1 < 0.1%
 
200000 1 < 0.1%
 
170000 1 < 0.1%
 
138000 1 < 0.1%
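With 70.1% zeros and a maximum of 350000, amount_tsh is dominated by a few extreme values. One common way to tame such a distribution is a log1p transform. This is an assumption about a possible next step, not something the report prescribes. The sketch below uses ten toy values drawn from the tables above.

```python
import numpy as np
import pandas as pd

# Toy sample: mostly zeros plus a few values that appear in the report.
amount = pd.Series([0, 0, 0, 0, 0, 0, 0, 20, 500, 350000],
                   dtype="float64", name="amount_tsh")

# log1p(x) = log(1 + x) keeps zeros at zero and compresses large values.
amount_log = np.log1p(amount)

raw_skew = amount.skew()      # skewness before the transform
log_skew = amount_log.skew()  # skewness after the transform

print(f"skewness raw={raw_skew:.2f}, log1p={log_skew:.2f}")
```

On this toy sample the transformed skewness is lower than the raw one. The real column would need to be checked separately.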
 

basin
Categorical

Distinct count: 9
Unique (%): < 0.1%
Missing (%): 0.0%
Missing (n): 0

Value Count Frequency (%)
Lake Victoria 10248 17.3%
 
Pangani 8940 15.1%
 
Rufiji 7976 13.4%
 
Internal 7785 13.1%
 
Lake Tanganyika 6432 10.8%
 
Wami / Ruvu 5987 10.1%
 
Lake Nyasa 5085 8.6%
 
Ruvuma / Southern Coast 4493 7.6%
 
Lake Rukwa 2454 4.1%
 
Max length: 23
Mean length: 10.8923569
Min length: 6
Contains chars: True
Contains digits: False
Contains spaces: True
Contains non-words: True

construction_year
Numeric

Distinct count: 55
Unique (%): 0.1%
Missing (%): 0.0%
Missing (n): 0
Infinite (%): 0.0%
Infinite (n): 0
Mean: 1300.652475
Minimum: 0
Maximum: 2013
Zeros (%): 34.9%

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1986
Q3: 2004
95-th percentile: 2010
Maximum: 2013
Range: 2013
Interquartile range: 2004

Descriptive statistics

Standard deviation: 951.6205473
Coef of variation: 0.7316485885
Kurtosis: -1.596432369
Mean: 1300.652475
MAD: 906.9094983
Skewness: -0.6349277866
Sum: 77258757
Variance: 905581.6661
Memory size: 928.1 KiB

Histogram with fixed size bins (bins=50)
Histogram with variable size bins (bins=[ 0. 980. 1960.5 1962.5 1963.5 ... 2007.5 2010.5 2011.5 2012.5 2013. ], "bayesian blocks" binning strategy used)

Value Count Frequency (%)
0 20709 34.9%
 
2010 2645 4.5%
 
2008 2613 4.4%
 
2009 2533 4.3%
 
2000 2091 3.5%
 
2007 1587 2.7%
 
2006 1471 2.5%
 
2003 1286 2.2%
 
2011 1256 2.1%
 
2004 1123 1.9%
 
Other values (45) 22086 37.2%
 

Minimum 5 values

Value Count Frequency (%)
0 20709 34.9%
 
1960 102 0.2%
 
1961 21 < 0.1%
 
1962 30 0.1%
 
1963 85 0.1%
 

Maximum 5 values

Value Count Frequency (%)
2013 176 0.3%
 
2012 1084 1.8%
 
2011 1256 2.1%
 
2010 2645 4.5%
 
2009 2533 4.3%
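A construction year of 0 is not a plausible calendar year, and the smallest non-zero value is 1960. The 34.9% zeros therefore most likely stand for "unknown". Under that assumption, the sketch below masks them before computing summary statistics, using a toy sample of values from the tables above.

```python
import pandas as pd

# Toy values from the frequency tables; 0 is assumed to be a placeholder for "unknown".
years = pd.Series([0, 2010, 2008, 0, 1986, 2000], name="construction_year")

# Replace the zeros with NaN so they no longer pull the mean towards 0.
years_clean = years.mask(years == 0)

print(years.mean())        # 1334.0, distorted by the zeros
print(years_clean.mean())  # 2001.0, mean over the recorded years only
```

The same masking idea would apply to other columns whose zeros look like placeholders. Each column has to be judged on its own.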
 

date_recorded
Categorical

Distinct count: 356
Unique (%): 0.6%
Missing (%): 0.0%
Missing (n): 0

Value Count Frequency (%)
2011-03-15 572 1.0%
 
2011-03-17 558 0.9%
 
2013-02-03 546 0.9%
 
2011-03-14 520 0.9%
 
2011-03-16 513 0.9%
 
2011-03-18 497 0.8%
 
2011-03-19 466 0.8%
 
2013-02-04 464 0.8%
 
2013-01-29 459 0.8%
 
2011-03-04 458 0.8%
 
Other values (346) 54347 91.5%
 
Max length: 10
Mean length: 10
Min length: 10
Contains chars: False
Contains digits: True
Contains spaces: False
Contains non-words: True

district_code
Numeric

Distinct count: 20
Unique (%): < 0.1%
Missing (%): 0.0%
Missing (n): 0
Infinite (%): 0.0%
Infinite (n): 0
Mean: 5.629747475
Minimum: 0
Maximum: 80
Zeros (%): < 0.1%

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 2
Median: 3
Q3: 5
95-th percentile: 30
Maximum: 80
Range: 80
Interquartile range: 3

Descriptive statistics

Standard deviation: 9.633648629
Coef of variation: 1.711204396
Kurtosis: 16.21428363
Mean: 5.629747475
MAD: 4.743533803
Skewness: 3.962045299
Sum: 334407
Variance: 92.80718592
Memory size: 928.1 KiB

Histogram with fixed size bins (bins=20)
Histogram with variable size bins (bins=[ 0. 0.5 1.5 2.5 3.5 ... 48. 56.5 61. 65. 80. ], "bayesian blocks" binning strategy used)

Value Count Frequency (%)
1 12203 20.5%
 
2 11173 18.8%
 
3 9998 16.8%
 
4 8999 15.1%
 
5 4356 7.3%
 
6 4074 6.9%
 
7 3343 5.6%
 
8 1043 1.8%
 
30 995 1.7%
 
33 874 1.5%
 
Other values (10) 2342 3.9%
 

Minimum 5 values

Value Count Frequency (%)
0 23 < 0.1%
 
1 12203 20.5%
 
2 11173 18.8%
 
3 9998 16.8%
 
4 8999 15.1%
 

Maximum 5 values

Value Count Frequency (%)
80 12 < 0.1%
 
67 6 < 0.1%
 
63 195 0.3%
 
62 109 0.2%
 
60 63 0.1%
 

extraction_type
Categorical

Distinct count: 18
Unique (%): < 0.1%
Missing (%): 0.0%
Missing (n): 0

Value Count Frequency (%)
gravity 26780 45.1%
 
nira/tanira 8154 13.7%
 
other 6430 10.8%
 
submersible 4764 8.0%
 
swn 80 3670 6.2%
 
mono 2865 4.8%
 
india mark ii 2400 4.0%
 
afridev 1770 3.0%
 
ksb 1415 2.4%
 
other - rope pump 451 0.8%
 
Other values (8) 701 1.2%
 
Max length: 25
Mean length: 7.719511785
Min length: 3
Contains chars: True
Contains digits: True
Contains spaces: True
Contains non-words: True

extraction_type_class
Categorical

Distinct count: 7
Unique (%): < 0.1%
Missing (%): 0.0%
Missing (n): 0

Value Count Frequency (%)
gravity 26780 45.1%
 
handpump 16456 27.7%
 
other 6430 10.8%
 
submersible 6179 10.4%
 
motorpump 2987 5.0%
 
rope pump 451 0.8%
 
wind-powered 117 0.2%
 
Max length: 12
Mean length: 7.602239057
Min length: 5
Contains chars: True
Contains digits: False
Contains spaces: True
Contains non-words: True

extraction_type_group
Categorical

Distinct count: 13
Unique (%): < 0.1%
Missing (%): 0.0%
Missing (n): 0

Value Count Frequency (%)
gravity 26780 45.1%
 
nira/tanira 8154 13.7%
 
other 6430 10.8%
 
submersible 6179 10.4%
 
swn 80 3670 6.2%
 
mono 2865 4.8%
 
india mark ii 2400 4.0%
 
afridev 1770 3.0%
 
rope pump 451 0.8%
 
other handpump 364 0.6%
 
Other values (3) 337 0.6%
 
Max length: 15
Mean length: 7.880538721
Min length: 4
Contains chars: True
Contains digits: True
Contains spaces: True
Contains non-words: True
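The three extraction columns look like progressively coarser codings of the same attribute (18, 13 and 7 distinct values). The counts support this for at least one label: submersible (4764) plus ksb (1415) in extraction_type equals the 6179 submersible rows in extraction_type_group. The sketch below shows one way to check whether a fine-grained column nests inside a coarser one. The toy rows and the ksb-to-submersible mapping are assumptions inferred from those counts.

```python
import pandas as pd

# Toy rows; the ksb -> submersible mapping is inferred from the counts, not confirmed.
df = pd.DataFrame({
    "extraction_type": ["gravity", "submersible", "ksb", "ksb", "swn 80"],
    "extraction_type_group": ["gravity", "submersible", "submersible",
                              "submersible", "swn 80"],
})

# Number of distinct coarse labels observed for each fine label.
groups_per_type = df.groupby("extraction_type")["extraction_type_group"].nunique()

# True when every fine label rolls up into exactly one coarse label.
is_nested = bool((groups_per_type == 1).all())
print(is_nested)  # True for this toy sample
```

If the check holds on the full data, the coarser columns add no information beyond the finest one and only one level needs to be kept.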

funder
Categorical

Distinct count: 1898
Unique (%): 3.2%
Missing (%): 6.1%
Missing (n): 3635

Value Count Frequency (%)
Government Of Tanzania 9084 15.3%
 
Danida 3114 5.2%
 
Hesawa 2202 3.7%
 
Rwssp 1374 2.3%
 
World Bank 1349 2.3%
 
Kkkt 1287 2.2%
 
World Vision 1246 2.1%
 
Unicef 1057 1.8%
 
Tasaf 877 1.5%
 
District Council 843 1.4%
 
Other values (1887) 33332 56.1%
 
(Missing) 3635 6.1%
 
Max length: 30
Mean length: 9.505824916
Min length: 1
Contains chars: True
Contains digits: True
Contains spaces: True
Contains non-words: True
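With 1898 distinct values, funder is hard to use directly as a categorical feature. One common approach, shown here as a sketch rather than a recommendation from the report, is to keep the most frequent labels and fold the rest into a single bucket. The cut-off top_n and the bucket name "other" are hypothetical choices.

```python
import pandas as pd

# Toy sample using labels from the frequency table; None marks a missing funder.
funder = pd.Series(
    ["Government Of Tanzania", "Danida", "Danida", "Hesawa", None,
     "Rwssp", "Government Of Tanzania", "Government Of Tanzania"],
    name="funder",
)

top_n = 2  # hypothetical cut-off; tune it on the real data

# value_counts() ignores missing values, so NaN never becomes a "top" label.
top_labels = funder.value_counts().nlargest(top_n).index

# Keep top labels and missing values as they are; everything else becomes "other".
collapsed = funder.where(funder.isin(top_labels) | funder.isna(), "other")
```

Missing values are left untouched here so that they can be handled by a separate imputation step.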

gps_height
Numeric

Distinct count: 2428
Unique (%): 4.1%
Missing (%): 0.0%
Missing (n): 0
Infinite (%): 0.0%
Infinite (n): 0
Mean: 668.2972391
Minimum: -90
Maximum: 2770
Zeros (%): 34.4%

Quantile statistics

Minimum: -90
5-th percentile: 0
Q1: 0
Median: 369
Q3: 1319.25
95-th percentile: 1797
Maximum: 2770
Range: 2860
Interquartile range: 1319.25

Descriptive statistics

Standard deviation: 693.1163503
Coef of variation: 1.037137833
Kurtosis: -1.292440135
Mean: 668.2972391
MAD: 637.9529678
Skewness: 0.462402085
Sum: 39696856
Variance: 480410.2751
Memory size: 928.1 KiB

Histogram with fixed size bins (bins=50)
Histogram with variable size bins (bins=[ -90. -58. -50.5 -40.5 -28.5 ... 2180.5 2200.5 2366.5 2627.5 2770. ], "bayesian blocks" binning strategy used)

Value Count Frequency (%)
0 20438 34.4%
 
-15 60 0.1%
 
-16 55 0.1%
 
-13 55 0.1%
 
-20 52 0.1%
 
1290 52 0.1%
 
-14 51 0.1%
 
303 51 0.1%
 
-18 49 0.1%
 
-19 47 0.1%
 
Other values (2418) 38490 64.8%
 

Minimum 5 values

Value Count Frequency (%)
-90 1 < 0.1%
 
-63 2 < 0.1%
 
-59 1 < 0.1%
 
-57 1 < 0.1%
 
-55 1 < 0.1%
 

Maximum 5 values

Value Count Frequency (%)
2770 1 < 0.1%
 
2628 1 < 0.1%
 
2627 1 < 0.1%
 
2626 2 < 0.1%
 
2623 1 < 0.1%
 

id
Numeric

Distinct count: 59400
Unique (%): 100.0%
Missing (%): 0.0%
Missing (n): 0
Infinite (%): 0.0%
Infinite (n): 0
Mean: 37115.13177
Minimum: 0
Maximum: 74247
Zeros (%): < 0.1%

Quantile statistics

Minimum: 0
5-th percentile: 3730.9
Q1: 18519.75
Median: 37061.5
Q3: 55656.5
95-th percentile: 70564.05
Maximum: 74247
Range: 74247
Interquartile range: 37136.75

Descriptive statistics

Standard deviation: 21453.12837
Coef of variation: 0.5780156866
Kurtosis: -1.201515029
Mean: 37115.13177
MAD: 18586.04643
Skewness: 0.00262253035
Sum: 2204638827
Variance: 460236716.9
Memory size: 3.4 MiB

Histogram with fixed size bins (bins=50)
Histogram with variable size bins (bins=[ 0. 74247.], "bayesian blocks" binning strategy used)

Value Count Frequency (%)
2047 1 < 0.1%
 
72310 1 < 0.1%
 
49805 1 < 0.1%
 
51852 1 < 0.1%
 
62091 1 < 0.1%
 
64138 1 < 0.1%
 
57993 1 < 0.1%
 
60040 1 < 0.1%
 
33413 1 < 0.1%
 
35460 1 < 0.1%
 
Other values (59390) 59390 > 99.9%
 

Minimum 5 values

Value Count Frequency (%)
0 1 < 0.1%
 
1 1 < 0.1%
 
2 1 < 0.1%
 
3 1 < 0.1%
 
4 1 < 0.1%
 

Maximum 5 values

Value Count Frequency (%)
74247 1 < 0.1%
 
74246 1 < 0.1%
 
74243 1 < 0.1%
 
74242 1 < 0.1%
 
74240 1 < 0.1%
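Because id is 100% unique, it behaves as a row identifier rather than a measurement, so its mean, skewness and histogram carry no analytical meaning. A sketch of one way to handle it is to move the column into the index. The three toy rows below are illustrative.

```python
import pandas as pd

# Toy rows; the ids are illustrative stand-ins for the real identifiers.
df = pd.DataFrame({
    "id": [69572, 8776, 34310],
    "status_group": ["functional", "functional", "functional"],
})

# verify_integrity=True raises a ValueError if any id is duplicated,
# which doubles as a check of the "100% unique" claim.
df_indexed = df.set_index("id", verify_integrity=True)
```

With the identifier in the index it is kept for joins and lookups but is no longer treated as a numeric feature.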
 

installer
Categorical

Distinct count: 2146
Unique (%): 3.6%
Missing (%): 6.2%
Missing (n): 3655

Value Count Frequency (%)
DWE 17402 29.3%
 
Government 1825 3.1%
 
RWE 1206 2.0%
 
Commu 1060 1.8%
 
DANIDA 1050 1.8%
 
KKKT 898 1.5%
 
Hesawa 840 1.4%
 
0 777 1.3%
 
TCRS 707 1.2%
 
Central government 622 1.0%
 
Other values (2135) 29358 49.4%
 
(Missing) 3655 6.2%
 
Max length: 30
Mean length: 5.91976431
Min length: 1
Contains chars: True
Contains digits: True
Contains spaces: True
Contains non-words: True
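The installer table mixes casing conventions with funder ("DANIDA" and "KKKT" here, "Danida" and "Kkkt" there) and contains the literal value "0" 777 times. A minimal clean-up sketch follows. It assumes that case differences are not meaningful and that "0" is a placeholder for a missing name. Both assumptions should be confirmed against the data source.

```python
import pandas as pd

# Toy sample; "dwe" and "Danida" are hypothetical case variants for illustration.
installer = pd.Series(["DWE", "dwe", "DANIDA", "Danida", "0", None], name="installer")

# Trim whitespace and lower-case so case variants collapse to one label.
normalized = installer.str.strip().str.lower()

# Treat the literal "0" as missing (assumed placeholder).
cleaned = normalized.mask(normalized == "0")
```

On the toy sample this leaves two distinct labels and two missing entries.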

latitude
Numeric

Distinct count: 57517
Unique (%): 96.8%
Missing (%): 0.0%
Missing (n): 0
Infinite (%): 0.0%
Infinite (n): 0
Mean: -5.70603266
Minimum: -11.64944018
Maximum: -2e-08
Zeros (%): 0.0%

Quantile statistics

Minimum: -11.64944018
5-th percentile: -10.58554992
Q1: -8.540621305
Median: -5.02159665
Q3: -3.32615564
95-th percentile: -1.408872227
Maximum: -2e-08
Range: 11.64944016
Interquartile range: 5.214465665

Descriptive statistics

Standard deviation: 2.946019081
Coef of variation: -0.5162990219
Kurtosis: -1.057616666
Mean: -5.70603266
MAD: 2.56776991
Skewness: -0.1520365709
Sum: -338938.34
Variance: 8.679028427
Memory size: 928.1 KiB

Histogram with fixed size bins (bins=50)
Histogram with variable size bins (bins=[-1.16494402e+01 -1.15676907e+01 -1.14763553e+01 -1.14412923e+01 -1.13237074e+01 ... -1.19709397e+00 -1.14437511e+00 -9.98690175e-01 -4.99232185e-01 -2.00000000e-08], "bayesian blocks" binning strategy used)

Value Count Frequency (%)
-2e-08 1812 3.1%
 
-6.99054864 2 < 0.1%
 
-2.48937845 2 < 0.1%
 
-2.51532072 2 < 0.1%
 
-6.96356538 2 < 0.1%
 
-2.5042939 2 < 0.1%
 
-9.2893492 2 < 0.1%
 
-2.51063865 2 < 0.1%
 
-6.99129411 2 < 0.1%
 
-2.48708461 2 < 0.1%
 
Other values (57507) 57570 96.9%
 

Minimum 5 values

Value Count Frequency (%)
-11.64944018 1 < 0.1%
 
-11.64837759 1 < 0.1%
 
-11.58629656 1 < 0.1%
 
-11.56857679 1 < 0.1%
 
-11.56680457 1 < 0.1%
 

Maximum 5 values

Value Count Frequency (%)
-2e-08 1812 3.1%
 
-0.99846435 1 < 0.1%
 
-0.998916 1 < 0.1%
 
-0.99901209 1 < 0.1%
 
-0.99911702 1 < 0.1%
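The negative coefficient of variation reported for latitude follows directly from its definition (standard deviation divided by mean) combined with a negative mean. The sketch below reproduces it from the figures above, along with the variance as the squared standard deviation.

```python
# Figures copied from the descriptive statistics for latitude.
std = 2.946019081
mean = -5.70603266

# Coefficient of variation = standard deviation / mean.
coef_of_variation = std / mean

# Variance is the square of the standard deviation.
variance = std ** 2

print(round(coef_of_variation, 4))  # -0.5163
print(round(variance, 4))           # 8.679
```

The sign comes from the negative mean, not from the spread, so the coefficient of variation says little for a coordinate like this one.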
 

lga
Categorical

Distinct count: 125
Unique (%): 0.2%
Missing (%): 0.0%
Missing (n): 0

Value Count Frequency (%)
Njombe 2503 4.2%
 
Arusha Rural 1252 2.1%
 
Moshi Rural 1251 2.1%
 
Bariadi 1177 2.0%
 
Rungwe 1106 1.9%
 
Kilosa 1094 1.8%
 
Kasulu 1047 1.8%
 
Mbozi 1034 1.7%
 
Meru 1009 1.7%
 
Bagamoyo 997 1.7%
 
Other values (115) 46930 79.0%
 
Max length: 16
Mean length: 7.416885522
Min length: 3
Contains chars: True
Contains digits: False
Contains spaces: True
Contains non-words: True

longitude
Numeric

Distinct count: 57516
Unique (%): 96.8%
Missing (%): 0.0%
Missing (n): 0
Infinite (%): 0.0%
Infinite (n): 0
Mean: 34.07742669
Minimum: 0
Maximum: 40.34519307
Zeros (%): 3.1%

Quantile statistics

Minimum: 0
5-th percentile: 30.04066001
Q1: 33.09034738
Median: 34.90874343
Q3: 37.17838657
95-th percentile: 39.13323954
Maximum: 40.34519307
Range: 40.34519307
Interquartile range: 4.08803919

Descriptive statistics

Standard deviation: 6.567431846
Coef of variation: 0.1927208854
Kurtosis: 19.18703105
Mean: 34.07742669
MAD: 3.302270448
Skewness: -4.191046455
Sum: 2024199.146
Variance: 43.13116105
Memory size: 3.4 MiB

Histogram with fixed size bins (bins=50)
Histogram with variable size bins (bins=[ 0. 14.80356095 29.60716149 29.63885953 29.68126761 ... 39.67089348 39.88985935 40.10245293 40.20239876 40.34519307], "bayesian blocks" binning strategy used)

Value Count Frequency (%)
0 1812 3.1%
 
39.08887513 2 < 0.1%
 
39.10530661 2 < 0.1%
 
37.54340145 2 < 0.1%
 
38.18053774 2 < 0.1%
 
32.98856004 2 < 0.1%
 
32.99327684 2 < 0.1%
 
39.09309544 2 < 0.1%
 
39.10124424 2 < 0.1%
 
32.96700926 2 < 0.1%
 
Other values (57506) 57570 96.9%
 

Minimum 5 values

Value Count Frequency (%)
0 1812 3.1%
 
29.6071219 1 < 0.1%
 
29.60720109 1 < 0.1%
 
29.61032056 1 < 0.1%
 
29.61096482 1 < 0.1%
 

Maximum 5 values

Value Count Frequency (%)
40.34519307 1 < 0.1%
 
40.34430089 1 < 0.1%
 
40.32523996 1 < 0.1%
 
40.32522643 1 < 0.1%
 
40.32340181 1 < 0.1%
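The 1812 rows with longitude 0 match the 1812 rows with latitude -2e-08, and the next smallest longitude is about 29.6. This suggests a shared "no GPS fix" placeholder, although the report does not confirm that the two sets of rows coincide. Under that assumption, the sketch below flags such rows and blanks both coordinates. The tolerance of 1e-6 is a hypothetical choice.

```python
import numpy as np
import pandas as pd

# Toy rows: one real-looking coordinate pair, one placeholder pair, one more real pair.
df = pd.DataFrame({
    "latitude": [-9.856322, -2e-08, -2.147466],
    "longitude": [34.938093, 0.0, 34.698766],
})

# Assumed placeholder pattern: longitude exactly 0 and latitude within 1e-6 of 0.
no_fix = (df["longitude"] == 0) & np.isclose(df["latitude"], 0.0, atol=1e-6)

# Blank out both coordinates together so they stay consistent.
df.loc[no_fix, ["latitude", "longitude"]] = np.nan
```

Masking these rows would remove the spike at 0 that drives the strongly negative skewness reported for longitude.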
 

management
Categorical

Distinct count: 12
Unique (%): < 0.1%
Missing (%): 0.0%
Missing (n): 0

Value Count Frequency (%)
vwc 40507 68.2%
 
wug 6515 11.0%
 
water board 2933 4.9%
 
wua 2535 4.3%
 
private operator 1971 3.3%
 
parastatal 1768 3.0%
 
water authority 904 1.5%
 
other 844 1.4%
 
company 685 1.2%
 
unknown 561 0.9%
 
Other values (2) 177 0.3%
 
Max length: 16
Mean length: 4.350639731
Min length: 3
Contains chars: True
Contains digits: False
Contains spaces: True
Contains non-words: True

management_group
Categorical

Distinct count: 5
Unique (%): < 0.1%
Missing (%): 0.0%
Missing (n): 0

Value Count Frequency (%)
user-group 52490 88.4%
 
commercial 3638 6.1%
 
parastatal 1768 3.0%
 
other 943 1.6%
 
unknown 561 0.9%
 
Max length: 10
Mean length: 9.892289562
Min length: 5
Contains chars: True
Contains digits: False
Contains spaces: False
Contains non-words: True

num_private
Numeric

Distinct count: 65
Unique (%): 0.1%
Missing (%): 0.0%
Missing (n): 0
Infinite (%): 0.0%
Infinite (n): 0
Mean: 0.4741414141
Minimum: 0
Maximum: 1776
Zeros (%): 98.7%

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 1776
Range: 1776
Interquartile range: 0

Descriptive statistics

Standard deviation: 12.23622981
Coef of variation: 25.80713147
Kurtosis: 11137.29521
Mean: 0.4741414141
MAD: 0.9361978097
Skewness: 91.93374999
Sum: 28164
Variance: 149.72532
Memory size: 3.4 MiB

Histogram with fixed size bins (bins=50)
Histogram with variable size bins (bins=[0.000e+00 5.000e-01 1.500e+00 4.500e+00 5.500e+00 ... 9.800e+01 1.065e+02 1.550e+02 7.265e+02 1.776e+03], "bayesian blocks" binning strategy used)

Value Count Frequency (%)
0 58643 98.7%
 
6 81 0.1%
 
1 73 0.1%
 
5 46 0.1%
 
8 46 0.1%
 
32 40 0.1%
 
45 36 0.1%
 
15 35 0.1%
 
39 30 0.1%
 
93 28 < 0.1%
 
Other values (55) 342 0.6%
 

Minimum 5 values

Value Count Frequency (%)
0 58643 98.7%
 
1 73 0.1%
 
2 23 < 0.1%
 
3 27 < 0.1%
 
4 20 < 0.1%
 

Maximum 5 values

Value Count Frequency (%)
1776 1 < 0.1%
 
1402 1 < 0.1%
 
755 1 < 0.1%
 
698 1 < 0.1%
 
672 1 < 0.1%
 

payment
Categorical

Distinct count: 7
Unique (%): < 0.1%
Missing (%): 0.0%
Missing (n): 0

Value Count Frequency (%)
never pay 25348 42.7%
 
pay per bucket 8985 15.1%
 
pay monthly 8300 14.0%
 
unknown 8157 13.7%
 
pay when scheme fails 3914 6.6%
 
pay annually 3642 6.1%
 
other 1054 1.8%
 
Max length: 21
Mean length: 10.66479798
Min length: 5
Contains chars: True
Contains digits: False
Contains spaces: True
Contains non-words: True

payment_type
Categorical

Distinct count: 7
Unique (%): < 0.1%
Missing (%): 0.0%
Missing (n): 0

Value Count Frequency (%)
never pay 25348 42.7%
 
per bucket 8985 15.1%
 
monthly 8300 14.0%
 
unknown 8157 13.7%
 
on failure 3914 6.6%
 
annually 3642 6.1%
 
other 1054 1.8%
 
Max length: 10
Mean length: 8.530757576
Min length: 5
Contains chars: True
Contains digits: False
Contains spaces: True
Contains non-words: True

permit
Boolean

Distinct count: 3
Unique (%): < 0.1%
Missing (%): 5.1%
Missing (n): 3056

Value Count Frequency (%)
True 38852 65.4%
 
False 17492 29.4%
 
(Missing) 3056 5.1%
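The distinct count of 3 for permit reflects True, False and missing. A column like this is typically stored as generic Python objects. The sketch below shows one way to keep all three states in a proper boolean column, using the pandas nullable "boolean" dtype on a toy sample.

```python
import pandas as pd

# Toy sample; a bool column with gaps is typically loaded as object dtype.
permit = pd.Series([True, False, None, True], name="permit")

# The nullable boolean dtype stores True, False and <NA> without falling back to object.
permit_bool = permit.astype("boolean")

# Aggregations skip <NA> by default, so this is the share among known values only.
share_true = permit_bool.mean()
print(share_true)  # 2 of the 3 known values are True
```

Whether the missing entries should later be imputed or kept as their own state is a modelling decision the report does not settle.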
 

population
Numeric

Distinct count: 1049
Unique (%): 1.8%
Missing (%): 0.0%
Missing (n): 0
Infinite (%): 0.0%
Infinite (n): 0
Mean: 179.9099832
Minimum: 0
Maximum: 30500
Zeros (%): 36.0%

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 25
Q3: 215
95-th percentile: 680
Maximum: 30500
Range: 30500
Interquartile range: 215

Descriptive statistics

Standard deviation: 471.4821757
Coef of variation: 2.620655994
Kurtosis: 402.2801153
Mean: 179.9099832
MAD: 214.6976938
Skewness: 12.66071359
Sum: 10686653
Variance: 222295.442
Memory size: 3.4 MiB

Histogram with fixed size bins (bins=50)
Histogram with variable size bins (bins=[0.00000e+00 5.00000e-01 1.50000e+00 4.50000e+00 5.50000e+00 ... 5.00800e+03 6.88800e+03 6.96100e+03 1.07315e+04 3.05000e+04], "bayesian blocks" binning strategy used)

Value Count Frequency (%)
0 21381 36.0%
 
1 7025 11.8%
 
200 1940 3.3%
 
150 1892 3.2%
 
250 1681 2.8%
 
300 1476 2.5%
 
100 1146 1.9%
 
50 1139 1.9%
 
500 1009 1.7%
 
350 986 1.7%
 
Other values (1039) 19725 33.2%
 

Minimum 5 values

Value Count Frequency (%)
0 21381 36.0%
 
1 7025 11.8%
 
2 4 < 0.1%
 
3 4 < 0.1%
 
4 13 < 0.1%
 

Maximum 5 values

Value Count Frequency (%)
30500 1 < 0.1%
 
15300 1 < 0.1%
 
11463 1 < 0.1%
 
10000 3 < 0.1%
 
9865 1 < 0.1%
 

public_meeting
Boolean

Distinct count: 3
Unique (%): < 0.1%
Missing (%): 5.6%
Missing (n): 3334

Value Count Frequency (%)
True 51011 85.9%
 
False 5055 8.5%
 
(Missing) 3334 5.6%
 

quality_group
Categorical

Distinct count: 6
Unique (%): < 0.1%
Missing (%): 0.0%
Missing (n): 0

Value Count Frequency (%)
good 50818 85.6%
 
salty 5195 8.7%
 
unknown 1876 3.2%
 
milky 804 1.4%
 
colored 490 0.8%
 
fluoride 217 0.4%
 
Max length: 8
Mean length: 4.23510101
Min length: 4
Contains chars: True
Contains digits: False
Contains spaces: False
Contains non-words: False

quantity
Categorical

Distinct count: 5
Unique (%): < 0.1%
Missing (%): 0.0%
Missing (n): 0

Value Count Frequency (%)
enough 33186 55.9%
 
insufficient 15129 25.5%
 
dry 6246 10.5%
 
seasonal 4050 6.8%
 
unknown 789 1.3%
 
Max length: 12
Mean length: 7.362373737
Min length: 3
Contains chars: True
Contains digits: False
Contains spaces: False
Contains non-words: False

quantity_group
Recoded

This variable is a recoding of quantity and should be ignored for analysis

recorded_by
Constant

This variable is constant and should be ignored for analysis

Constant value: GeoData Consultants Ltd
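Both rejected variables can be removed before modelling. The sketch below first re-checks the two claims made by the report (quantity_group duplicates quantity, recorded_by is constant) on a toy frame and then drops the columns. Running the same checks on the real data before dropping is a precaution, not something the report requires.

```python
import pandas as pd

# Toy frame with the two rejected columns and the column one of them duplicates.
df = pd.DataFrame({
    "quantity": ["enough", "dry", "seasonal"],
    "quantity_group": ["enough", "dry", "seasonal"],
    "recorded_by": ["GeoData Consultants Ltd"] * 3,
})

# Re-check the report's claims before discarding anything.
is_recoding = bool((df["quantity"] == df["quantity_group"]).all())
is_constant = df["recorded_by"].nunique() == 1

if is_recoding and is_constant:
    df = df.drop(columns=["quantity_group", "recorded_by"])
```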

region
Categorical

Distinct count: 21
Unique (%): < 0.1%
Missing (%): 0.0%
Missing (n): 0

Value Count Frequency (%)
Iringa 5294 8.9%
 
Shinyanga 4982 8.4%
 
Mbeya 4639 7.8%
 
Kilimanjaro 4379 7.4%
 
Morogoro 4006 6.7%
 
Arusha 3350 5.6%
 
Kagera 3316 5.6%
 
Mwanza 3102 5.2%
 
Kigoma 2816 4.7%
 
Ruvuma 2640 4.4%
 
Other values (11) 20876 35.1%
 
Max length: 13
Mean length: 6.623754209
Min length: 4
Contains chars: True
Contains digits: False
Contains spaces: True
Contains non-words: True

region_code
Numeric

Distinct count: 27
Unique (%): < 0.1%
Missing (%): 0.0%
Missing (n): 0
Infinite (%): 0.0%
Infinite (n): 0
Mean: 15.29700337
Minimum: 1
Maximum: 99
Zeros (%): 0.0%

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 5
Median: 12
Q3: 17
95-th percentile: 60
Maximum: 99
Range: 98
Interquartile range: 12

Descriptive statistics

Standard deviation: 17.58740634
Coef of variation: 1.149728866
Kurtosis: 10.28843341
Mean: 15.29700337
MAD: 9.486968586
Skewness: 3.17381811
Sum: 908642
Variance: 309.3168617
Memory size: 3.4 MiB

Histogram with fixed size bins (bins=27)
Histogram with variable size bins (bins=[ 1. 1.5 2.5 3.5 4.5 ... 32. 50. 70. 85. 99. ], "bayesian blocks" binning strategy used)

Value Count Frequency (%)
11 5300 8.9%
 
17 5011 8.4%
 
12 4639 7.8%
 
3 4379 7.4%
 
5 4040 6.8%
 
18 3324 5.6%
 
19 3047 5.1%
 
2 3024 5.1%
 
16 2816 4.7%
 
10 2640 4.4%
 
Other values (17) 21180 35.7%
 

Minimum 5 values

Value Count Frequency (%)
1 2201 3.7%
 
2 3024 5.1%
 
3 4379 7.4%
 
4 2513 4.2%
 
5 4040 6.8%
 

Maximum 5 values

Value Count Frequency (%)
99 423 0.7%
 
90 917 1.5%
 
80 1238 2.1%
 
60 1025 1.7%
 
40 1 < 0.1%
 

scheme_management
Categorical

Distinct count: 13
Unique (%): < 0.1%
Missing (%): 6.5%
Missing (n): 3877

Value Count Frequency (%)
VWC 36793 61.9%
 
WUG 5206 8.8%
 
Water authority 3153 5.3%
 
WUA 2883 4.9%
 
Water Board 2748 4.6%
 
Parastatal 1680 2.8%
 
Private operator 1063 1.8%
 
Company 1061 1.8%
 
Other 766 1.3%
 
SWC 97 0.2%
 
Other values (2) 73 0.1%
 
(Missing) 3877 6.5%
 
Max length: 16
Mean length: 4.537373737
Min length: 3
Contains chars: True
Contains digits: False
Contains spaces: True
Contains non-words: True

scheme_name
Categorical

Distinct count: 2697
Unique (%): 4.5%
Missing (%): 47.4%
Missing (n): 28166

Value Count Frequency (%)
K 682 1.1%
 
None 644 1.1%
 
Borehole 546 0.9%
 
Chalinze wate 405 0.7%
 
M 400 0.7%
 
DANIDA 379 0.6%
 
Government 320 0.5%
 
Ngana water supplied scheme 270 0.5%
 
wanging'ombe water supply s 261 0.4%
 
wanging'ombe supply scheme 234 0.4%
 
Other values (2686) 27093 45.6%
 
(Missing) 28166 47.4%
 
Max length: 46
Mean length: 8.94456229
Min length: 1
Contains chars: True
Contains digits: True
Contains spaces: True
Contains non-words: True

source
Categorical

Distinct count: 10
Unique (%): < 0.1%
Missing (%): 0.0%
Missing (n): 0

Value Count Frequency (%)
spring 17021 28.7%
 
shallow well 16824 28.3%
 
machine dbh 11075 18.6%
 
river 9612 16.2%
 
rainwater harvesting 2295 3.9%
 
hand dtw 874 1.5%
 
lake 765 1.3%
 
dam 656 1.1%
 
other 212 0.4%
 
unknown 66 0.1%
 
Max length: 20
Mean length: 8.978804714
Min length: 3
Contains chars: True
Contains digits: False
Contains spaces: True
Contains non-words: True

source_class
Categorical

Distinct count: 3
Unique (%): < 0.1%
Missing (%): 0.0%
Missing (n): 0

Value Count Frequency (%)
groundwater 45794 77.1%
 
surface 13328 22.4%
 
unknown 278 0.5%
 
Max length: 11
Mean length: 10.08377104
Min length: 7
Contains chars: True
Contains digits: False
Contains spaces: False
Contains non-words: False

source_type
Categorical

Distinct count: 7
Unique (%): < 0.1%
Missing (%): 0.0%
Missing (n): 0

Value Count Frequency (%)
spring 17021 28.7%
 
shallow well 16824 28.3%
 
borehole 11949 20.1%
 
river/lake 10377 17.5%
 
rainwater harvesting 2295 3.9%
 
dam 656 1.1%
 
other 278 0.5%
 
Max length: 20
Mean length: 9.303602694
Min length: 3
Contains chars: True
Contains digits: False
Contains spaces: True
Contains non-words: True

status_group
Categorical

Distinct count: 3
Unique (%): < 0.1%
Missing (%): 0.0%
Missing (n): 0

Value Count Frequency (%)
functional 32259 54.3%
 
non functional 22824 38.4%
 
functional needs repair 4317 7.3%
 
Max length: 23
Mean length: 12.48176768
Min length: 10
Contains chars: True
Contains digits: False
Contains spaces: True
Contains non-words: True
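The three status_group classes are unevenly represented (54.3%, 38.4% and 7.3%). If this column were used as a classification target, one common way to compensate would be inverse-frequency class weights. The sketch below applies the n_samples / (n_classes * count) heuristic, the same formula scikit-learn documents for class_weight="balanced", to the counts in the table above. Treating status_group as the target is an assumption.

```python
# Counts copied from the frequency table for status_group.
counts = {
    "functional": 32259,
    "non functional": 22824,
    "functional needs repair": 4317,
}

total = sum(counts.values())  # equals the number of observations
n_classes = len(counts)

# Balanced heuristic: rarer classes receive proportionally larger weights.
weights = {label: total / (n_classes * n) for label, n in counts.items()}

for label, weight in weights.items():
    print(f"{label}: {weight:.2f}")
```

The rarest class, "functional needs repair", ends up with a weight of about 4.59, compared with about 0.61 for "functional".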

subvillage
Categorical

Distinct count: 19288
Unique (%): 32.5%
Missing (%): 0.6%
Missing (n): 371

Value Count Frequency (%)
Madukani 508 0.9%
 
Shuleni 506 0.9%
 
Majengo 502 0.8%
 
Kati 373 0.6%
 
Mtakuja 262 0.4%
 
Sokoni 232 0.4%
 
M 187 0.3%
 
Muungano 172 0.3%
 
Mbuyuni 164 0.3%
 
Mlimani 152 0.3%
 
Other values (19277) 55971 94.2%
 
(Missing) 371 0.6%
 
Max length: 30
Mean length: 7.867003367
Min length: 1
Contains chars: True
Contains digits: True
Contains spaces: True
Contains non-words: True

ward
Categorical

Distinct count: 2092
Unique (%): 3.5%
Missing (%): 0.0%
Missing (n): 0

Value Count Frequency (%)
Igosi 307 0.5%
 
Imalinyi 252 0.4%
 
Siha Kati 232 0.4%
 
Mdandu 231 0.4%
 
Nduruma 217 0.4%
 
Mishamo 203 0.3%
 
Kitunda 203 0.3%
 
Msindo 201 0.3%
 
Chalinze 196 0.3%
 
Maji ya Chai 190 0.3%
 
Other values (2082) 57168 96.2%
 
Max length: 23
Mean length: 7.505841751
Min length: 3
Contains chars: True
Contains digits: False
Contains spaces: True
Contains non-words: True

water_quality
Categorical

Distinct count: 8
Unique (%): < 0.1%
Missing (%): 0.0%
Missing (n): 0

Value Count Frequency (%)
soft 50818 85.6%
 
salty 4856 8.2%
 
unknown 1876 3.2%
 
milky 804 1.4%
 
coloured 490 0.8%
 
salty abandoned 339 0.6%
 
fluoride 200 0.3%
 
fluoride abandoned 17 < 0.1%
 
Max length: 18
Mean length: 4.303282828
Min length: 4
Contains chars: True
Contains digits: False
Contains spaces: True
Contains non-words: True

waterpoint_type
Categorical

Distinct count: 7
Unique (%): < 0.1%
Missing (%): 0.0%
Missing (n): 0

Value Count Frequency (%)
communal standpipe 28522 48.0%
 
hand pump 17488 29.4%
 
other 6380 10.7%
 
communal standpipe multiple 6103 10.3%
 
improved spring 784 1.3%
 
cattle trough 116 0.2%
 
dam 7 < 0.1%
 
Max length: 27
Mean length: 14.82757576
Min length: 3
Contains chars: True
Contains digits: False
Contains spaces: True
Contains non-words: True

waterpoint_type_group
Categorical

Distinct count: 6
Unique (%): < 0.1%
Missing (%): 0.0%
Missing (n): 0

Value Count Frequency (%)
communal standpipe 34625 58.3%
 
hand pump 17488 29.4%
 
other 6380 10.7%
 
improved spring 784 1.3%
 
cattle trough 116 0.2%
 
dam 7 < 0.1%
 
Max length: 18
Mean length: 13.90287879
Min length: 3
Contains chars: True
Contains digits: False
Contains spaces: True
Contains non-words: True

wpt_name
Categorical

Distinct count: 37400
Unique (%): 63.0%
Missing (%): 0.0%
Missing (n): 0

Value Count Frequency (%)
none 3563 6.0%
 
Shuleni 1748 2.9%
 
Zahanati 830 1.4%
 
Msikitini 535 0.9%
 
Kanisani 323 0.5%
 
Bombani 271 0.5%
 
Sokoni 260 0.4%
 
Ofisini 254 0.4%
 
School 208 0.4%
 
Shule Ya Msingi 199 0.3%
 
Other values (37390) 51209 86.2%
 
Max length: 30
Mean length: 10.96210438
Min length: 1
Contains chars: True
Contains digits: True
Contains spaces: True
Contains non-words: True

Correlations

Missing values

Sample

First rows

index | amount_tsh | basin | construction_year | date_recorded | district_code | extraction_type | extraction_type_class | extraction_type_group | funder | gps_height | id | installer | latitude | lga | longitude | management | management_group | num_private | payment | payment_type | permit | population | public_meeting | quality_group | quantity | quantity_group | recorded_by | region | region_code | scheme_management | scheme_name | source | source_class | source_type | status_group | subvillage | ward | water_quality | waterpoint_type | waterpoint_type_group | wpt_name
0 | 6000.0 | Lake Nyasa | 1999 | 2011-03-14 | 5 | gravity | gravity | gravity | Roman | 1390 | 69572 | Roman | -9.856322 | Ludewa | 34.938093 | vwc | user-group | 0 | pay annually | annually | False | 109 | True | good | enough | enough | GeoData Consultants Ltd | Iringa | 11 | VWC | Roman | spring | groundwater | spring | functional | Mnyusi B | Mundindi | soft | communal standpipe | communal standpipe | none
1 | 0.0 | Lake Victoria | 2010 | 2013-03-06 | 2 | gravity | gravity | gravity | Grumeti | 1399 | 8776 | GRUMETI | -2.147466 | Serengeti | 34.698766 | wug | user-group | 0 | never pay | never pay | True | 280 | NaN | good | insufficient | insufficient | GeoData Consultants Ltd | Mara | 20 | Other | NaN | rainwater harvesting | surface | rainwater harvesting | functional | Nyamara | Natta | soft | communal standpipe | communal standpipe | Zahanati
2 | 25.0 | Pangani | 2009 | 2013-02-25 | 4 | gravity | gravity | gravity | Lottery Club | 686 | 34310 | World vision | -3.821329 | Simanjiro | 37.460664 | vwc | user-group | 0 | pay per bucket | per bucket | True | 250 | True | good | enough | enough | GeoData Consultants Ltd | Manyara | 21 | VWC | Nyumba ya mungu pipe scheme | dam | surface | dam | functional | Majengo | Ngorika | soft | communal standpipe multiple | communal standpipe | Kwa Mahundi
3 | 0.0 | Ruvuma / Southern Coast | 1986 | 2013-01-28 | 63 | submersible | submersible | submersible | Unicef | 263 | 67743 | UNICEF | -11.155298 | Nanyumbu | 38.486161 | vwc | user-group | 0 | never pay | never pay | True | 58 | True | good | dry | dry | GeoData Consultants Ltd | Mtwara | 90 | VWC | NaN | machine dbh | groundwater | borehole | non functional | Mahakamani | Nanyumbu | soft | communal standpipe multiple | communal standpipe | Zahanati Ya Nanyumbu
4 | 0.0 | Lake Victoria | 0 | 2011-07-13 | 1 | gravity | gravity | gravity | Action In A | 0 | 19728 | Artisan | -1.825359 | Karagwe | 31.130847 | other | other | 0 | never pay | never pay | True | 0 | True | good | seasonal | seasonal | GeoData Consultants Ltd | Kagera | 18 | NaN | NaN | rainwater harvesting | surface | rainwater harvesting | functional | Kyanyamisa | Nyakasimbi | soft | communal standpipe | communal standpipe | Shuleni
5 | 20.0 | Pangani | 2009 | 2011-03-13 | 8 | submersible | submersible | submersible | Mkinga Distric Coun | 0 | 9944 | DWE | -4.765587 | Mkinga | 39.172796 | vwc | user-group | 0 | pay per bucket | per bucket | True | 1 | True | salty | enough | enough | GeoData Consultants Ltd | Tanga | 4 | VWC | Zingibali | other | unknown | other | functional | Moa/Mwereme | Moa | salty | communal standpipe multiple | communal standpipe | Tajiri
6 | 0.0 | Internal | 0 | 2012-10-01 | 3 | swn 80 | handpump | swn 80 | Dwsp | 0 | 19816 | DWSP | -3.766365 | Shinyanga Rural | 33.362410 | vwc | user-group | 0 | never pay | never pay | True | 0 | True | good | enough | enough | GeoData Consultants Ltd | Shinyanga | 17 | VWC | NaN | machine dbh | groundwater | borehole | non functional | Ishinabulandi | Samuye | soft | hand pump | hand pump | Kwa Ngomho
7 | 0.0 | Lake Tanganyika | 0 | 2012-10-09 | 3 | nira/tanira | handpump | nira/tanira | Rwssp | 0 | 54551 | DWE | -4.226198 | Kahama | 32.620617 | wug | user-group | 0 | unknown | unknown | True | 0 | True | milky | enough | enough | GeoData Consultants Ltd | Shinyanga | 17 | NaN | NaN | shallow well | groundwater | shallow well | non functional | Nyawishi Center | Chambo | milky | hand pump | hand pump | Tushirikiane
8 | 0.0 | Lake Tanganyika | 0 | 2012-11-03 | 6 | india mark ii | handpump | india mark ii | Wateraid | 0 | 53934 | Water Aid | -5.146712 | Tabora Urban | 32.711100 | vwc | user-group | 0 | never pay | never pay | True | 0 | True | salty | seasonal | seasonal | GeoData Consultants Ltd | Tabora | 14 | VWC | NaN | machine dbh | groundwater | borehole | non functional | Imalauduki | Itetemia | salty | hand pump | hand pump | Kwa Ramadhan Musa
9 | 0.0 | Lake Victoria | 0 | 2011-08-03 | 1 | nira/tanira | handpump | nira/tanira | Isingiro Ho | 0 | 46144 | Artisan | -1.257051 | Karagwe | 30.626991 | vwc | user-group | 0 | never pay | never pay | True | 0 | True | good | enough | enough | GeoData Consultants Ltd | Kagera | 18 | NaN | NaN | shallow well | groundwater | shallow well | functional | Mkonomre | Kaisho | soft | hand pump | hand pump | Kwapeto

Last rows

index | amount_tsh | basin | construction_year | date_recorded | district_code | extraction_type | extraction_type_class | extraction_type_group | funder | gps_height | id | installer | latitude | lga | longitude | management | management_group | num_private | payment | payment_type | permit | population | public_meeting | quality_group | quantity | quantity_group | recorded_by | region | region_code | scheme_management | scheme_name | source | source_class | source_type | status_group | subvillage | ward | water_quality | waterpoint_type | waterpoint_type_group | wpt_name
59390 | 0.0 | Lake Tanganyika | 1991 | 2011-08-04 | 2 | swn 80 | handpump | swn 80 | Rudep | 1715 | 13677 | DWE | -8.258160 | Sumbawanga Rural | 31.370848 | vwc | user-group | 0 | never pay | never pay | False | 150 | True | good | insufficient | insufficient | GeoData Consultants Ltd | Rukwa | 15 | VWC | NaN | machine dbh | groundwater | borehole | functional | Kitonto | Mkowe | soft | hand pump | hand pump | Kwa Mzee Atanas
59391 | 0.0 | Pangani | 1967 | 2013-08-03 | 3 | gravity | gravity | gravity | Government Of Tanzania | 540 | 44885 | Government | -4.272218 | Same | 38.044070 | vwc | user-group | 0 | never pay | never pay | True | 210 | True | good | enough | enough | GeoData Consultants Ltd | Kilimanjaro | 3 | Water authority | Hingilili | river | surface | river/lake | non functional | Maore Kati | Maore | soft | communal standpipe | communal standpipe | Kwa
59392 | 0.0 | Lake Rukwa | 0 | 2011-04-15 | 1 | gravity | gravity | gravity | Government Of Tanzania | 0 | 40607 | Government | -8.520888 | Chunya | 33.009440 | vwc | user-group | 0 | never pay | never pay | True | 0 | True | good | enough | enough | GeoData Consultants Ltd | Mbeya | 12 | VWC | NaN | spring | groundwater | spring | non functional | Mbuyuni A | Mbuyuni | soft | communal standpipe | communal standpipe | Benard Charles
59393 | 0.0 | Internal | 0 | 2012-10-27 | 2 | gravity | gravity | gravity | Private | 0 | 48348 | Private | -4.287410 | Igunga | 33.866852 | private operator | commercial | 0 | pay per bucket | per bucket | False | 0 | False | good | insufficient | insufficient | GeoData Consultants Ltd | Tabora | 14 | Water authority | NaN | dam | surface | dam | functional | Masanga | Igunga | soft | other | other | Kwa Peter
59394 | 500.0 | Wami / Ruvu | 2007 | 2011-03-09 | 6 | submersible | submersible | submersible | World Bank | 351 | 11164 | ML appro | -6.124830 | Mvomero | 37.634053 | vwc | user-group | 0 | pay monthly | monthly | True | 89 | True | good | enough | enough | GeoData Consultants Ltd | Morogoro | 5 | VWC | NaN | machine dbh | groundwater | borehole | non functional | Komstari | Diongoya | soft | communal standpipe | communal standpipe | Chimeredya
59395 | 10.0 | Pangani | 1999 | 2013-05-03 | 5 | gravity | gravity | gravity | Germany Republi | 1210 | 60739 | CES | -3.253847 | Hai | 37.169807 | water board | user-group | 0 | pay per bucket | per bucket | True | 125 | True | good | enough | enough | GeoData Consultants Ltd | Kilimanjaro | 3 | Water Board | Losaa Kia water supply | spring | groundwater | spring | functional | Kiduruni | Masama Magharibi | soft | communal standpipe | communal standpipe | Area Three Namba 27
59396 | 4700.0 | Rufiji | 1996 | 2011-05-07 | 4 | gravity | gravity | gravity | Cefa-njombe | 1212 | 27263 | Cefa | -9.070629 | Njombe | 35.249991 | vwc | user-group | 0 | pay annually | annually | True | 56 | True | good | enough | enough | GeoData Consultants Ltd | Iringa | 11 | VWC | Ikondo electrical water sch | river | surface | river/lake | functional | Igumbilo | Ikondo | soft | communal standpipe | communal standpipe | Kwa Yahona Kuvala
59397 | 0.0 | Rufiji | 0 | 2011-04-11 | 7 | swn 80 | handpump | swn 80 | NaN | 0 | 37057 | NaN | -8.750434 | Mbarali | 34.017087 | vwc | user-group | 0 | pay monthly | monthly | False | 0 | True | fluoride | enough | enough | GeoData Consultants Ltd | Mbeya | 12 | VWC | NaN | machine dbh | groundwater | borehole | functional | Madungulu | Chimala | fluoride | hand pump | hand pump | Mashine
59398 | 0.0 | Rufiji | 0 | 2011-03-08 | 4 | nira/tanira | handpump | nira/tanira | Malec | 0 | 31282 | Musa | -6.378573 | Chamwino | 35.861315 | vwc | user-group | 0 | never pay | never pay | True | 0 | True | good | insufficient | insufficient | GeoData Consultants Ltd | Dodoma | 1 | VWC | NaN | shallow well | groundwater | shallow well | functional | Mwinyi | Mvumi Makulu | soft | hand pump | hand pump | Mshoro
59399 | 0.0 | Wami / Ruvu | 2002 | 2011-03-23 | 2 | nira/tanira | handpump | nira/tanira | World Bank | 191 | 26348 | World | -6.747464 | Morogoro Rural | 38.104048 | vwc | user-group | 0 | pay when scheme fails | on failure | True | 150 | True | salty | enough | enough | GeoData Consultants Ltd | Morogoro | 5 | VWC | NaN | shallow well | groundwater | shallow well | functional | Kikatanyemba | Ngerengere | salty | hand pump | hand pump | Kwa Mzee Lugawa